IN THE CLAIMS 

Please amend the claims as follows. 

For the Examiner's convenience, a list of all claims is included below. 

1. (Currently Amended) A machine-implemented method comprising: 
extracting portions from time domain speech segments, the portions surrounding 

a segment boundary within a phoneme; 

identifying time samples from the portions; 

creating feature vectors that represent the portions in a vector space, the feature 
vectors incorporating phase information of the portions, wherein the creating feature 
vectors comprises constructing a matrix W containing the time samples from the portions 
surrounding the segment boundary within the phoneme; and deriving feature vectors that 
represent the portions in a vector space by decomposing the matrix W containing the time 
samples from the portions surrounding the segment boundary within the phoneme, such
that at least phase information of the portions is preserved in the feature vectors; and

determining a distance between the feature vectors in the vector space. 

2. (Canceled). 

3. (Previously Presented) The machine-implemented method of claim 1, wherein 
decomposing the matrix W comprises extracting global boundary-centric features from 
the portions. 

4. (Previously Presented) The machine-implemented method of claim 1, wherein
the speech segments each include the segment boundary within the phoneme. 

5. (Original) The machine-implemented method of claim 4, wherein the speech 
segments each include at least one diphone. 

6. (Original) The machine-implemented method of claim 5, wherein the portions 
include at least one pitch period. 

7. (Original) The machine-implemented method of claim 6, wherein decomposing 
the matrix W comprises performing a pitch synchronous singular value analysis on the 
pitch periods of the time-domain segments. 

8. (Previously Presented) The machine-implemented method of claim 6, wherein
the matrix W is a $2KM \times N$ matrix represented by

$$W = U \Sigma V^T$$

where K is the number of pitch periods near the segment boundary extracted from each
segment, N is the maximum number of samples among the pitch periods, M is the number
of segments in a voice table having a segment boundary within the phoneme, U is the
$2KM \times R$ left singular matrix with row vectors $u_i$ ($1 \le i \le 2KM$), $\Sigma$ is the $R \times R$ diagonal
matrix of singular values $s_1 \ge s_2 \ge \dots \ge s_R > 0$, V is the $N \times R$ right singular matrix with
row vectors $v_j$ ($1 \le j \le N$), $R \ll 2KM$, and T denotes matrix transposition, wherein
decomposing the matrix W comprises performing a singular value decomposition of W.

9. (Original) The machine-implemented method of claim 8, wherein the pitch 
periods are zero padded to N samples. 
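
The construction recited in claims 8 and 9 can be illustrated with a minimal sketch. It assumes NumPy is available; the function names `build_w` and `decompose`, the toy sample values, and the choice R = 2 are hypothetical and are not taken from the claims.

```python
import numpy as np

def build_w(pitch_periods):
    """Stack pitch periods as rows of W, zero padding each one to N samples."""
    n = max(len(p) for p in pitch_periods)   # N: length of the longest pitch period
    w = np.zeros((len(pitch_periods), n))
    for row, period in enumerate(pitch_periods):
        w[row, :len(period)] = period        # trailing zeros pad the shorter periods
    return w

def decompose(w, r):
    """Order-R SVD of W: returns U (rows x R), singular values (R,), V (N x R)."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    return u[:, :r], s[:r], vt[:r, :].T

# Hypothetical toy data: K = 1 pitch period on each side of the boundary for
# M = 2 segments, giving 2KM = 4 rows.
periods = [[0.1, 0.5, -0.2], [0.2, 0.4, -0.1, 0.05], [0.0, 0.6], [0.3, 0.3, 0.3, 0.3]]
W = build_w(periods)
U, s, V = decompose(W, r=2)
```

Keeping only the R largest singular values is one way to satisfy the condition that R be much smaller than 2KM; a real voice table would have far more rows than this toy example.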

10. (Original) The machine-implemented method of claim 9, wherein a feature vector
$\bar{u}_i$ is calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a pitch period i, and $\Sigma$ is the singular diagonal
matrix.
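
As a small illustration of the scaling in claim 10: because the singular matrix is diagonal, multiplying a row of U by it scales each coordinate by the matching singular value. The sketch below uses plain Python lists; the function name `feature_vectors` and the numeric values are hypothetical.

```python
def feature_vectors(u_rows, singular_values):
    """Return one scaled row per pitch period: coordinate r is multiplied by s_r."""
    return [[u * s for u, s in zip(row, singular_values)] for row in u_rows]

U = [[0.6, 0.8], [0.8, -0.6]]   # hypothetical rows of the left singular matrix
s = [2.0, 0.5]                  # hypothetical singular values, largest first
features = feature_vectors(U, s)
```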

11. (Original) The machine-implemented method of claim 10, wherein the distance
between two feature vectors is determined by a metric comprising the cosine of the angle 
between the two feature vectors. 



12. (Original) The machine-implemented method of claim 11, wherein the metric
comprises a closeness measure, C, between two feature vectors, $\bar{u}_k$ and $\bar{u}_l$, wherein C is
calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^2 u_l^T}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \le k, l \le 2KM$.
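
The closeness measure of claim 12 is an ordinary cosine similarity between the scaled rows. A minimal sketch in plain Python follows; the function name `closeness` and the sample vectors are hypothetical.

```python
import math

def closeness(u_k, u_l, singular_values):
    """Cosine of the angle between u_k*Sigma and u_l*Sigma (Sigma diagonal)."""
    fk = [u * s for u, s in zip(u_k, singular_values)]
    fl = [u * s for u, s in zip(u_l, singular_values)]
    dot = sum(a * b for a, b in zip(fk, fl))      # equals u_k Sigma^2 u_l^T
    norm_k = math.sqrt(sum(a * a for a in fk))
    norm_l = math.sqrt(sum(b * b for b in fl))
    return dot / (norm_k * norm_l)

same = closeness([1.0, 0.0], [1.0, 0.0], [2.0, 1.0])
orthogonal = closeness([1.0, 0.0], [0.0, 1.0], [2.0, 1.0])
partial = closeness([1.0, 1.0], [1.0, -1.0], [2.0, 1.0])
```

The sketch does not guard against zero-length vectors; a production implementation would need to handle that case.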



13. (Original) The machine-implemented method of claim 12, wherein a difference
$d(S_1, S_2)$ between two segments in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = d_0(p_1, q_1) = 1 - C(\bar{u}_{p_1}, \bar{u}_{q_1})$$

where $d_0$ is the distance between pitch periods $p_1$ and $q_1$, $p_1$ is the last pitch period of $S_1$,
$q_1$ is the first pitch period of $S_2$, $\bar{u}_{p_1}$ is a feature vector associated with pitch period $p_1$,
and $\bar{u}_{q_1}$ is a feature vector associated with pitch period $q_1$.
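
The distance of claim 13 follows directly from the closeness measure. The sketch below assumes the feature vectors have already been scaled by the singular values; the names `cosine` and `boundary_distance` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine of the angle between two already-scaled feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def boundary_distance(feat_p1, feat_q1):
    """d(S1, S2) = 1 - C, using the last period of S1 and the first period of S2."""
    return 1.0 - cosine(feat_p1, feat_q1)

d_same = boundary_distance([1.0, 0.0], [1.0, 0.0])
d_orth = boundary_distance([1.0, 0.0], [0.0, 1.0])
d_opposite = boundary_distance([1.0, 0.0], [-1.0, 0.0])
```

Since the cosine lies between -1 and 1, this distance ranges from 0 (identical direction) to 2 (opposite direction).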



14. (Original) The machine-implemented method of claim 13, wherein the
calculation for the difference between two segments in the voice table, $S_1$ and $S_2$, is
expanded to include a plurality of pitch periods from each segment.

15. (Original) The machine-implemented method of claim 13, wherein the difference
between two segments in the voice table, $S_1$ and $S_2$, is associated with a discontinuity
between $S_1$ and $S_2$.

16. (Original) The machine-implemented method of claim 12, wherein a difference
$d(S_1, S_2)$ between two segments in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = \left| d_0(p_1, q_1) - \frac{d_0(p_1, \bar{p}_1) + d_0(q_1, \bar{q}_1)}{2} \right| = \left| \frac{C(\bar{u}_{p_1}, \bar{u}_{\bar{p}_1}) + C(\bar{u}_{q_1}, \bar{u}_{\bar{q}_1})}{2} - C(\bar{u}_{p_1}, \bar{u}_{q_1}) \right|$$

where $d_0$ is the distance between pitch periods, $p_1$ is the last pitch period of $S_1$, $\bar{p}_1$ is the
first pitch period of a segment contiguous to $S_1$, $q_1$ is the first pitch period of $S_2$, $\bar{q}_1$ is
the last pitch period of a segment contiguous to $S_2$, $\bar{u}_{p_1}$ is a feature vector associated with
pitch period $p_1$, $\bar{u}_{q_1}$ is a feature vector associated with pitch period $q_1$, $\bar{u}_{\bar{p}_1}$ is a feature
vector associated with pitch period $\bar{p}_1$, and $\bar{u}_{\bar{q}_1}$ is a feature vector associated with pitch
period $\bar{q}_1$.
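
The measure of claim 16 compares the candidate join against the two natural joins that each segment had in its original context. A minimal sketch, assuming pre-scaled feature vectors; the names `cosine` and `differential_distance` and the argument names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine of the angle between two already-scaled feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def differential_distance(f_p1, f_p1_bar, f_q1, f_q1_bar):
    """|average closeness of the two natural joins - closeness of the new join|."""
    natural = (cosine(f_p1, f_p1_bar) + cosine(f_q1, f_q1_bar)) / 2.0
    return abs(natural - cosine(f_p1, f_q1))

a, b = [1.0, 0.0], [0.0, 1.0]
no_change = differential_distance(a, a, a, a)   # join as smooth as the originals
worst = differential_distance(a, a, b, b)       # smooth originals, orthogonal join
```

Substituting $d_0 = 1 - C$ into the first form of the equation yields the second form, which is the one the sketch computes.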

17. (Previously Presented) The machine-implemented method of claim 1, further 
comprising associating the distance between the feature vectors with speech segments in 
a voice table. 

18. (Original) The machine-implemented method of claim 17, further comprising: 
selecting speech segments from the voice table based on the distance between the 

feature vectors. 
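
One possible reading of the selection step in claims 17 and 18 is sketched below. It assumes that distances are stored per ordered pair of segments and that the candidate with the smallest distance is preferred; the claims only require selection "based on the distance", so this rule, the function name `select_best`, and the table contents are hypothetical.

```python
def select_best(candidates, distance_table, previous):
    """Pick the candidate whose stored distance to `previous` is smallest."""
    return min(candidates, key=lambda seg: distance_table[(previous, seg)])

# Hypothetical voice-table entries: (left segment, right segment) -> distance.
table = {("a1", "b1"): 0.42, ("a1", "b2"): 0.07, ("a1", "b3"): 0.31}
best = select_best(["b1", "b2", "b3"], table, "a1")
```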

19. (Original) The machine-implemented method of claim 5, wherein the portions
include centered pitch periods. 

20. (Previously Presented) The machine-implemented method of claim 19, wherein
the matrix W is a $(2(K-1)+1)M \times N$ matrix represented by

$$W = U \Sigma V^T$$

where $K-1$ is the number of centered pitch periods near the segment boundary extracted
from each segment, N is the maximum number of samples among the centered pitch
periods, M is the number of segments in a voice table having a segment boundary within
the phoneme, U is the $(2(K-1)+1)M \times R$ left singular matrix with row vectors
$u_i$ ($1 \le i \le (2(K-1)+1)M$), $\Sigma$ is the $R \times R$ diagonal matrix of singular values
$s_1 \ge s_2 \ge \dots \ge s_R > 0$, V is the $N \times R$ right singular matrix with row vectors $v_j$ ($1 \le j \le N$),
$R \ll (2(K-1)+1)M$, and T denotes matrix transposition, wherein decomposing the matrix W
comprises performing a singular value decomposition of W.

21. (Original) The machine-implemented method of claim 20, wherein the centered
pitch periods are symmetrically zero padded to N samples.
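
Symmetric zero padding, as recited in claim 21, can be sketched as follows. The claim does not say where the extra zero goes when the padding amount is odd; placing it on the right is an assumption, and the function name `pad_symmetric` is hypothetical.

```python
def pad_symmetric(period, n):
    """Center `period` in a list of length n, filling both sides with zeros."""
    extra = n - len(period)
    if extra < 0:
        raise ValueError("pitch period is longer than the target length")
    left = extra // 2                 # assumption: any odd leftover zero goes right
    return [0.0] * left + list(period) + [0.0] * (extra - left)

even_case = pad_symmetric([1.0, 2.0, 3.0], 7)
odd_case = pad_symmetric([1.0, 2.0], 5)
```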

22. (Original) The machine-implemented method of claim 21, wherein a feature
vector $\bar{u}_i$ is calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a centered pitch period i, and $\Sigma$ is the singular
diagonal matrix.

23. (Original) The machine-implemented method of claim 22, wherein the distance
between two feature vectors is determined by a metric comprising a closeness measure,
C, between two feature vectors, $\bar{u}_k$ and $\bar{u}_l$, wherein C is calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^2 u_l^T}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \le k, l \le (2(K-1)+1)M$.

24. (Original) The machine-implemented method of claim 23, wherein a difference
$d(S_1, S_2)$ between two segments in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = C(\bar{u}_{\pi_{-1}}, \bar{u}_{\delta_0}) + C(\bar{u}_{\delta_0}, \bar{u}_{\sigma_1}) - C(\bar{u}_{\pi_{-1}}, \bar{u}_{\pi_0}) - C(\bar{u}_{\sigma_0}, \bar{u}_{\sigma_1})$$

where $\bar{u}_{\pi_{-1}}$ is a feature vector associated with a centered pitch period $\pi_{-1}$, $\bar{u}_{\delta_0}$ is a
feature vector associated with a centered pitch period $\delta_0$, $\bar{u}_{\sigma_1}$ is a feature vector
associated with a centered pitch period $\sigma_1$, $\bar{u}_{\pi_0}$ is a feature vector associated with a
centered pitch period $\pi_0$, and $\bar{u}_{\sigma_0}$ is a feature vector associated with a centered pitch
period $\sigma_0$.
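
The four-term expression of claim 24 can be evaluated as written once the five feature vectors are available. The sketch below assumes pre-scaled feature vectors; the names `cosine` and `centered_distance`, the argument names, and the sample values are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine of the angle between two already-scaled feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def centered_distance(f_pi_m1, f_pi_0, f_delta_0, f_sigma_0, f_sigma_1):
    """Sum of the two closeness terms around delta_0 minus the two original terms."""
    return (cosine(f_pi_m1, f_delta_0) + cosine(f_delta_0, f_sigma_1)
            - cosine(f_pi_m1, f_pi_0) - cosine(f_sigma_0, f_sigma_1))

x, y = [1.0, 0.0], [0.0, 1.0]
unchanged = centered_distance(x, x, x, x, x)   # boundary period matches originals
mismatch = centered_distance(x, x, y, x, x)    # boundary period orthogonal to neighbors
```

As written in the claim, the expression is not wrapped in an absolute value, so the result can be negative, as the second example shows.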

25. (Currently Amended) A machine-readable medium having instructions to cause a 
machine to perform a machine-implemented method comprising: 

extracting portions from time domain speech segments that surround a segment 

boundary within a phoneme; 

identifying time samples from the portions; 

creating feature vectors that represent the portions in a vector space, the feature
vectors incorporating phase information of the portions, wherein the creating feature
vectors comprises constructing a matrix W containing the time samples from the portions
surrounding the segment boundary within the phoneme; and deriving feature vectors that
represent the portions in a vector space by decomposing the matrix W containing the time
samples from the portions surrounding the segment boundary within the phoneme, such
that at least phase information of the portions is preserved in the feature vectors; and

determining a distance between the feature vectors in the vector space. 

26. (Canceled). 

27. (Previously Presented) The machine-readable medium of claim 25, wherein 
decomposing the matrix W comprises extracting global boundary-centric features from 
the portions. 

28. (Previously Presented) The machine-readable medium of claim 25, wherein the 
speech segments each include the segment boundary within the phoneme. 

29. (Original) The machine-readable medium of claim 28, wherein the speech 
segments each include at least one diphone. 

30. (Original) The machine-readable medium of claim 29, wherein the portions 
include at least one pitch period. 

31. (Original) The machine-readable medium of claim 30, wherein decomposing the
matrix W comprises performing a pitch synchronous singular value analysis on the pitch 
periods of the time-domain segments. 

32. (Previously Presented) The machine-readable medium of claim 30, wherein the
matrix W is a $2KM \times N$ matrix represented by

$$W = U \Sigma V^T$$

where K is the number of pitch periods near the segment boundary extracted from each
segment, N is the maximum number of samples among the pitch periods, M is the number
of segments in a voice table having a segment boundary within the phoneme, U is the
$2KM \times R$ left singular matrix with row vectors $u_i$ ($1 \le i \le 2KM$), $\Sigma$ is the $R \times R$ diagonal
matrix of singular values $s_1 \ge s_2 \ge \dots \ge s_R > 0$, V is the $N \times R$ right singular matrix with
row vectors $v_j$ ($1 \le j \le N$), $R \ll 2KM$, and T denotes matrix transposition, wherein
decomposing the matrix W comprises performing a singular value decomposition of W.

33. (Original) The machine-readable medium of claim 32, wherein the pitch periods 
are zero padded to N samples. 

34. (Original) The machine-readable medium of claim 33, wherein a feature vector
$\bar{u}_i$ is calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a pitch period i, and $\Sigma$ is the singular diagonal
matrix.



35. (Original) The machine-readable medium of claim 34, wherein the distance 
between two feature vectors is determined by a metric comprising the cosine of the angle 
between the two feature vectors. 



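36. (Original) The machine-readable medium of claim 35, wherein the metric
comprises a closeness measure, C, between two feature vectors, $\bar{u}_k$ and $\bar{u}_l$, wherein C is
calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^2 u_l^T}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \le k, l \le 2KM$.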

37. (Original) The machine-readable medium of claim 36, wherein a difference
$d(S_1, S_2)$ between two segments in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = d_0(p_1, q_1) = 1 - C(\bar{u}_{p_1}, \bar{u}_{q_1})$$

where $d_0$ is the distance between pitch periods $p_1$ and $q_1$, $p_1$ is the last pitch period of $S_1$,
$q_1$ is the first pitch period of $S_2$, $\bar{u}_{p_1}$ is a feature vector associated with pitch period $p_1$,
and $\bar{u}_{q_1}$ is a feature vector associated with pitch period $q_1$.

38. (Original) The machine-readable medium of claim 37, wherein the calculation for
the difference between two segments in the voice table, $S_1$ and $S_2$, is expanded to include
a plurality of pitch periods from each segment.

39. (Original) The machine-readable medium of claim 37, wherein the difference
between two segments in the voice table, $S_1$ and $S_2$, is associated with a discontinuity
between $S_1$ and $S_2$.

40. (Original) The machine-readable medium of claim 36, wherein a difference
$d(S_1, S_2)$ between two segments in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = \left| d_0(p_1, q_1) - \frac{d_0(p_1, \bar{p}_1) + d_0(q_1, \bar{q}_1)}{2} \right| = \left| \frac{C(\bar{u}_{p_1}, \bar{u}_{\bar{p}_1}) + C(\bar{u}_{q_1}, \bar{u}_{\bar{q}_1})}{2} - C(\bar{u}_{p_1}, \bar{u}_{q_1}) \right|$$

where $d_0$ is the distance between pitch periods, $p_1$ is the last pitch period of $S_1$, $\bar{p}_1$ is the
first pitch period of a segment contiguous to $S_1$, $q_1$ is the first pitch period of $S_2$, $\bar{q}_1$ is
the last pitch period of a segment contiguous to $S_2$, $\bar{u}_{p_1}$ is a feature vector associated with
pitch period $p_1$, $\bar{u}_{q_1}$ is a feature vector associated with pitch period $q_1$, $\bar{u}_{\bar{p}_1}$ is a feature
vector associated with pitch period $\bar{p}_1$, and $\bar{u}_{\bar{q}_1}$ is a feature vector associated with pitch
period $\bar{q}_1$.

41. (Previously Presented) The machine-readable medium of claim 25, wherein the 
method further comprises associating the distance between the feature vectors with 
speech segments in a voice table. 

42. (Original) The machine-readable medium of claim 41, wherein the method
further comprises: 

selecting speech segments from the voice table based on the distance between the 
feature vectors. 

43. (Original) The machine-readable medium of claim 29, wherein the portions 
include centered pitch periods. 

44. (Previously Presented) The machine-readable medium of claim 43, wherein the
matrix W is a $(2(K-1)+1)M \times N$ matrix represented by

$$W = U \Sigma V^T$$

where $K-1$ is the number of centered pitch periods near the segment boundary extracted
from each segment, N is the maximum number of samples among the centered pitch
periods, M is the number of segments in a voice table having a segment boundary within
the phoneme, U is the $(2(K-1)+1)M \times R$ left singular matrix with row vectors
$u_i$ ($1 \le i \le (2(K-1)+1)M$), $\Sigma$ is the $R \times R$ diagonal matrix of singular values
$s_1 \ge s_2 \ge \dots \ge s_R > 0$, V is the $N \times R$ right singular matrix with row vectors $v_j$ ($1 \le j \le N$),
$R \ll (2(K-1)+1)M$, and T denotes matrix transposition, wherein decomposing the matrix W
comprises performing a singular value decomposition of W.

45. (Original) The machine-readable medium of claim 44, wherein the centered pitch 
periods are symmetrically zero padded to N samples. 

46. (Original) The machine-readable medium of claim 45, wherein a feature vector $\bar{u}_i$
is calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a centered pitch period i, and $\Sigma$ is the singular
diagonal matrix.

47. (Original) The machine-readable medium of claim 46, wherein the distance
between two feature vectors is determined by a metric comprising a closeness measure,
C, between two feature vectors, $\bar{u}_k$ and $\bar{u}_l$, wherein C is calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^2 u_l^T}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \le k, l \le (2(K-1)+1)M$.

48. (Original) The machine-readable medium of claim 47, wherein a difference
$d(S_1, S_2)$ between two segments in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = C(\bar{u}_{\pi_{-1}}, \bar{u}_{\delta_0}) + C(\bar{u}_{\delta_0}, \bar{u}_{\sigma_1}) - C(\bar{u}_{\pi_{-1}}, \bar{u}_{\pi_0}) - C(\bar{u}_{\sigma_0}, \bar{u}_{\sigma_1})$$

where $\bar{u}_{\pi_{-1}}$ is a feature vector associated with a centered pitch period $\pi_{-1}$, $\bar{u}_{\delta_0}$ is a
feature vector associated with a centered pitch period $\delta_0$, $\bar{u}_{\sigma_1}$ is a feature vector
associated with a centered pitch period $\sigma_1$, $\bar{u}_{\pi_0}$ is a feature vector associated with a
centered pitch period $\pi_0$, and $\bar{u}_{\sigma_0}$ is a feature vector associated with a centered pitch
period $\sigma_0$.

49. (Currently Amended) An apparatus comprising:

means for extracting portions from time domain speech segments, the portions 
surrounding a segment boundary within a phoneme; 

means for identifying time samples from the portions; 

means for creating feature vectors that represent the portions in a vector space, the 
feature vectors incorporating phase information of the portions, wherein means for 
creating feature vectors comprises means for constructing a matrix W containing the time 
samples from the portions surrounding the segment boundary within the phoneme; and 
means for deriving feature vectors that represent the portions in a vector space by 
decomposing the matrix W containing the time samples from the portions surrounding
the segment boundary within the phoneme, such that at least phase information of the
portions is preserved in the feature vectors; and

means for determining a distance between the feature vectors in the vector space. 

50. (Canceled). 

51. (Previously Presented) The apparatus of claim 49, wherein the means for 
decomposing the matrix W comprises means for extracting global boundary-centric 
features from the portions. 

52. (Previously Presented) The apparatus of claim 49, wherein the speech segments 
each include the segment boundary within the phoneme. 

53. (Original) The apparatus of claim 52, wherein the speech segments each include
at least one diphone. 

54. (Original) The apparatus of claim 53, wherein the portions include at least one 
pitch period. 

55. (Original) The apparatus of claim 54, wherein the means for decomposing the 
matrix W comprises means for performing a pitch synchronous singular value analysis on 
the pitch periods of the time-domain segments. 

56. (Previously Presented) The apparatus of claim 54, wherein the matrix W is a
$2KM \times N$ matrix represented by

$$W = U \Sigma V^T$$

where K is the number of pitch periods near the segment boundary extracted from each
segment, N is the maximum number of samples among the pitch periods, M is the number
of segments in a voice table having a segment boundary within the phoneme, U is the
$2KM \times R$ left singular matrix with row vectors $u_i$ ($1 \le i \le 2KM$), $\Sigma$ is the $R \times R$ diagonal
matrix of singular values $s_1 \ge s_2 \ge \dots \ge s_R > 0$, V is the $N \times R$ right singular matrix with
row vectors $v_j$ ($1 \le j \le N$), $R \ll 2KM$, and T denotes matrix transposition, wherein
decomposing the matrix W comprises performing a singular value decomposition of W.

57. (Original) The apparatus of claim 56, wherein the pitch periods are zero padded
to N samples.

58. (Original) The apparatus of claim 57, wherein a feature vector $\bar{u}_i$ is calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a pitch period i, and $\Sigma$ is the singular diagonal
matrix.

59. (Original) The apparatus of claim 58, wherein the distance between two feature 
vectors is determined by a metric comprising the cosine of the angle between the two 
feature vectors. 



60. (Original) The apparatus of claim 59, wherein the metric comprises a closeness
measure, C, between two feature vectors, $\bar{u}_k$ and $\bar{u}_l$, wherein C is calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^2 u_l^T}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \le k, l \le 2KM$.

61. (Original) The apparatus of claim 60, wherein a difference $d(S_1, S_2)$ between two
segments in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = d_0(p_1, q_1) = 1 - C(\bar{u}_{p_1}, \bar{u}_{q_1})$$

where $d_0$ is the distance between pitch periods $p_1$ and $q_1$, $p_1$ is the last pitch period of $S_1$,
$q_1$ is the first pitch period of $S_2$, $\bar{u}_{p_1}$ is a feature vector associated with pitch period $p_1$,
and $\bar{u}_{q_1}$ is a feature vector associated with pitch period $q_1$.

62. (Original) The apparatus of claim 61, wherein the calculation for the difference
between two segments in the voice table, $S_1$ and $S_2$, is expanded to include a plurality of
pitch periods from each segment.

63. (Original) The apparatus of claim 61, wherein the difference between two
segments in the voice table, $S_1$ and $S_2$, is associated with a discontinuity between $S_1$ and
$S_2$.

64. (Original) The apparatus of claim 60, wherein a difference $d(S_1, S_2)$ between two
segments in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = \left| d_0(p_1, q_1) - \frac{d_0(p_1, \bar{p}_1) + d_0(q_1, \bar{q}_1)}{2} \right| = \left| \frac{C(\bar{u}_{p_1}, \bar{u}_{\bar{p}_1}) + C(\bar{u}_{q_1}, \bar{u}_{\bar{q}_1})}{2} - C(\bar{u}_{p_1}, \bar{u}_{q_1}) \right|$$

where $d_0$ is the distance between pitch periods, $p_1$ is the last pitch period of $S_1$, $\bar{p}_1$ is the
first pitch period of a segment contiguous to $S_1$, $q_1$ is the first pitch period of $S_2$, $\bar{q}_1$ is
the last pitch period of a segment contiguous to $S_2$, $\bar{u}_{p_1}$ is a feature vector associated with
pitch period $p_1$, $\bar{u}_{q_1}$ is a feature vector associated with pitch period $q_1$, $\bar{u}_{\bar{p}_1}$ is a feature
vector associated with pitch period $\bar{p}_1$, and $\bar{u}_{\bar{q}_1}$ is a feature vector associated with pitch
period $\bar{q}_1$.

65. (Previously Presented) The apparatus of claim 49, further comprising means for 
associating the distance between the feature vectors with speech segments in a voice 
table. 

66. (Original) The apparatus of claim 65, further comprising: 

means for selecting speech segments from the voice table based on the distance 
between the feature vectors. 

67. (Original) The apparatus of claim 53, wherein the portions include centered pitch 
periods. 

68. (Previously Presented) The apparatus of claim 67, wherein the matrix W is a
$(2(K-1)+1)M \times N$ matrix represented by

$$W = U \Sigma V^T$$

where $K-1$ is the number of centered pitch periods near the segment boundary extracted
from each segment, N is the maximum number of samples among the centered pitch
periods, M is the number of segments in a voice table having a segment boundary within
the phoneme, U is the $(2(K-1)+1)M \times R$ left singular matrix with row vectors
$u_i$ ($1 \le i \le (2(K-1)+1)M$), $\Sigma$ is the $R \times R$ diagonal matrix of singular values
$s_1 \ge s_2 \ge \dots \ge s_R > 0$, V is the $N \times R$ right singular matrix with row vectors $v_j$ ($1 \le j \le N$),
$R \ll (2(K-1)+1)M$, and T denotes matrix transposition, wherein decomposing the matrix W
comprises performing a singular value decomposition of W.



69. (Original) The apparatus of claim 68, wherein the centered pitch periods are 
symmetrically zero padded to N samples. 

70. (Original) The apparatus of claim 69, wherein a feature vector $\bar{u}_i$ is calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a centered pitch period i, and $\Sigma$ is the singular
diagonal matrix.

71. (Original) The apparatus of claim 70, wherein the distance between two feature
vectors is determined by a metric comprising a closeness measure, C, between two
feature vectors, $\bar{u}_k$ and $\bar{u}_l$, wherein C is calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^2 u_l^T}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \le k, l \le (2(K-1)+1)M$.

72. (Original) The apparatus of claim 71, wherein a difference $d(S_1, S_2)$ between two
segments in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = C(\bar{u}_{\pi_{-1}}, \bar{u}_{\delta_0}) + C(\bar{u}_{\delta_0}, \bar{u}_{\sigma_1}) - C(\bar{u}_{\pi_{-1}}, \bar{u}_{\pi_0}) - C(\bar{u}_{\sigma_0}, \bar{u}_{\sigma_1})$$

where $\bar{u}_{\pi_{-1}}$ is a feature vector associated with a centered pitch period $\pi_{-1}$, $\bar{u}_{\delta_0}$ is a
feature vector associated with a centered pitch period $\delta_0$, $\bar{u}_{\sigma_1}$ is a feature vector
associated with a centered pitch period $\sigma_1$, $\bar{u}_{\pi_0}$ is a feature vector associated with a
centered pitch period $\pi_0$, and $\bar{u}_{\sigma_0}$ is a feature vector associated with a centered pitch
period $\sigma_0$.

73. (Currently Amended) A system comprising: 

a processing unit coupled to a memory through a bus; and 
wherein the processing unit is configured, for a process, to extract portions from time 
domain speech segments, the portions surrounding a segment boundary within a 
phoneme, identify time samples from the portions; create feature vectors that represent 
the portions in a vector space, wherein the processing unit is configured, when creating 
feature vectors, to construct a matrix W containing the time samples from the portions 
surrounding the segment boundary within the phoneme, and derive feature vectors that 
represent the portions in a vector space by decomposing the matrix W
containing the time samples from the portions surrounding the segment boundary within
the phoneme, such that at least phase information of the
portions is preserved in the feature vectors, and determine a distance between the feature
vectors in the vector space. 

74. (Canceled). 

75. (Previously Presented) The system of claim 73, wherein the process further
causes the processing unit, when decomposing the matrix W, to extract global boundary- 
centric features from the portions. 

76. (Previously Presented) The system of claim 73, wherein the speech segments 
each include the segment boundary within the phoneme. 

77. (Original) The system of claim 76, wherein the speech segments each include at 
least one diphone. 

78. (Original) The system of claim 77, wherein the portions include at least one pitch 
period. 

79. (Original) The system of claim 78, wherein the process further causes the 
processing unit, when decomposing the matrix W, to perform a pitch synchronous 
singular value analysis on the pitch periods of the time-domain segments. 

80. (Previously Presented) The system of claim 78, wherein the matrix W is a
$2KM \times N$ matrix represented by

$$W = U \Sigma V^T$$

where K is the number of pitch periods near the segment boundary extracted from each
segment, N is the maximum number of samples among the pitch periods, M is the number
of segments in a voice table having a segment boundary within the phoneme, U is the
$2KM \times R$ left singular matrix with row vectors $u_i$ ($1 \le i \le 2KM$), $\Sigma$ is the $R \times R$ diagonal
matrix of singular values $s_1 \ge s_2 \ge \dots \ge s_R > 0$, V is the $N \times R$ right singular matrix with
row vectors $v_j$ ($1 \le j \le N$), $R \ll 2KM$, and T denotes matrix transposition, wherein
decomposing the matrix W comprises performing a singular value decomposition of W.

81. (Original) The system of claim 80, wherein the pitch periods are zero padded to 
N samples. 

82. (Original) The system of claim 81, wherein a feature vector $\bar{u}_i$ is calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a pitch period i, and $\Sigma$ is the singular diagonal
matrix.

83. (Original) The system of claim 82, wherein the distance between two feature 
vectors is determined by a metric comprising the cosine of the angle between the two 
feature vectors. 

84. (Original) The system of claim 83, wherein the metric comprises a closeness
measure, C, between two feature vectors, $\bar{u}_k$ and $\bar{u}_l$, wherein C is calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^2 u_l^T}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \le k, l \le 2KM$.



85. (Original) The system of claim 84, wherein a difference $d(S_1, S_2)$ between two
segments in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = d_0(p_1, q_1) = 1 - C(\bar{u}_{p_1}, \bar{u}_{q_1})$$

where $d_0$ is the distance between pitch periods $p_1$ and $q_1$, $p_1$ is the last pitch period of $S_1$,
$q_1$ is the first pitch period of $S_2$, $\bar{u}_{p_1}$ is a feature vector associated with pitch period $p_1$,
and $\bar{u}_{q_1}$ is a feature vector associated with pitch period $q_1$.



86. (Original) The system of claim 85, wherein the calculation for the difference
between two segments in the voice table, $S_1$ and $S_2$, is expanded to include a plurality of
pitch periods from each segment.

87. (Original) The system of claim 85, wherein the difference between two segments
in the voice table, $S_1$ and $S_2$, is associated with a discontinuity between $S_1$ and $S_2$.

88. (Original) The system of claim 84, wherein a difference $d(S_1, S_2)$ between two
segments in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = \left| d_0(p_1, q_1) - \frac{d_0(p_1, \bar{p}_1) + d_0(q_1, \bar{q}_1)}{2} \right| = \left| \frac{C(\bar{u}_{p_1}, \bar{u}_{\bar{p}_1}) + C(\bar{u}_{q_1}, \bar{u}_{\bar{q}_1})}{2} - C(\bar{u}_{p_1}, \bar{u}_{q_1}) \right|$$

where $d_0$ is the distance between pitch periods, $p_1$ is the last pitch period of $S_1$, $\bar{p}_1$ is the
first pitch period of a segment contiguous to $S_1$, $q_1$ is the first pitch period of $S_2$, $\bar{q}_1$ is
the last pitch period of a segment contiguous to $S_2$, $\bar{u}_{p_1}$ is a feature vector associated with
pitch period $p_1$, $\bar{u}_{q_1}$ is a feature vector associated with pitch period $q_1$, $\bar{u}_{\bar{p}_1}$ is a feature
vector associated with pitch period $\bar{p}_1$, and $\bar{u}_{\bar{q}_1}$ is a feature vector associated with pitch
period $\bar{q}_1$.

89. (Previously Presented) The system of claim 73, wherein the process further
causes the processing unit to associate the distance between the feature vectors with 
speech segments in a voice table. 

90. (Original) The system of claim 89, wherein the process further causes the 
processing unit to select speech segments from the voice table based on the distance 
between the feature vectors. 

91. (Original) The system of claim 77, wherein the portions include centered pitch 
periods. 

92. (Previously Presented) The system of claim 91, wherein the matrix W is a
$(2(K-1)+1)M \times N$ matrix represented by

$$W = U \Sigma V^T$$

where $K-1$ is the number of centered pitch periods near the segment boundary extracted
from each segment, N is the maximum number of samples among the centered pitch
periods, M is the number of segments in a voice table having a segment boundary within
the phoneme, U is the $(2(K-1)+1)M \times R$ left singular matrix with row vectors
$u_i$ ($1 \le i \le (2(K-1)+1)M$), $\Sigma$ is the $R \times R$ diagonal matrix of singular values
$s_1 \ge s_2 \ge \dots \ge s_R > 0$, V is the $N \times R$ right singular matrix with row vectors $v_j$ ($1 \le j \le N$),
$R \ll (2(K-1)+1)M$, and T denotes matrix transposition, wherein decomposing the matrix W
comprises performing a singular value decomposition of W.

93. (Original) The system of claim 92, wherein the centered pitch periods are 
symmetrically zero padded to N samples. 

94. (Original) The system of claim 93, wherein a feature vector $\bar{u}_i$ is calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a centered pitch period i, and $\Sigma$ is the singular
diagonal matrix.

95. (Original) The system of claim 94, wherein the distance between two feature
vectors is determined by a metric comprising a closeness measure, C, between two
feature vectors, $\bar{u}_k$ and $\bar{u}_l$, wherein C is calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^2 u_l^T}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \le k, l \le (2(K-1)+1)M$.

96. (Original) The system of claim 95, wherein a difference $d(S_1, S_2)$ between two
segments in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = C(\bar{u}_{\pi_{-1}}, \bar{u}_{\delta_0}) + C(\bar{u}_{\delta_0}, \bar{u}_{\sigma_1}) - C(\bar{u}_{\pi_{-1}}, \bar{u}_{\pi_0}) - C(\bar{u}_{\sigma_0}, \bar{u}_{\sigma_1})$$

where $\bar{u}_{\pi_{-1}}$ is a feature vector associated with a centered pitch period $\pi_{-1}$, $\bar{u}_{\delta_0}$ is a
feature vector associated with a centered pitch period $\delta_0$, $\bar{u}_{\sigma_1}$ is a feature vector
associated with a centered pitch period $\sigma_1$, $\bar{u}_{\pi_0}$ is a feature vector associated with a
centered pitch period $\pi_0$, and $\bar{u}_{\sigma_0}$ is a feature vector associated with a centered pitch
period $\sigma_0$.

97. (Currently Amended) A machine-implemented method comprising:
gathering time-domain samples from recorded speech segments, wherein the 

time-domain samples include time samples of pitch periods surrounding a segment 
boundary within a phoneme; 

extracting features that represent the time domain samples, wherein the extracting 
features comprises constructing a matrix containing the time domain samples of the pitch 
periods surrounding the segment boundary within the phoneme and deriving feature 
vectors that represent the time samples in a vector space by decomposing the matrix 
containing the time domain samples of the pitch periods surrounding the segment 
boundary within the phoneme, such that at least phase information of the time samples is
preserved in the feature vectors;

determining a discontinuity between the segments, the discontinuity based on a 
distance between the features. 

98. (Canceled). 

99. (Previously Presented) The machine-implemented method of claim 97, wherein 
the features incorporate phase information of the pitch periods. 

100. (Canceled). 

101. (Currently Amended) A machine-readable medium having instructions to cause a 
machine to perform a machine-implemented method comprising: 

gathering time-domain samples from recorded speech segments, wherein the 
time-domain samples include time samples of pitch periods surrounding a segment 
boundary within a phoneme; 

extracting features that represent the time domain samples, wherein the extracting
features comprises constructing a matrix containing the time domain samples of the pitch 
periods surrounding the segment boundary within the phoneme and deriving feature 
vectors that represent the time samples in a vector space by decomposing the matrix 
containing the time domain samples of the pitch periods surrounding the segment 
boundary within the phoneme, such that at least phase information of the time samples is
preserved in the feature vectors;

determining a discontinuity between the segments, the discontinuity based on a 
distance between the features. 

102. (Canceled). 

103. (Previously Presented) The machine-readable medium of claim 101, wherein the 
features incorporate phase information of the pitch periods. 

104. (Canceled). 

105. (Currently Amended) An apparatus comprising: 

means for gathering time-domain samples from recorded speech segments, 
wherein the time-domain samples include time samples of pitch periods surrounding a 
segment boundary within a phoneme; 

means for extracting features that represent the time domain samples, wherein the 
means for extracting features comprises means for constructing a matrix containing the
time domain samples of the pitch periods surrounding the segment boundary within the 
phoneme and deriving feature vectors that represent the time samples in a vector space by 
decomposing the matrix containing the time domain samples of the pitch periods 
surrounding the segment boundary within the phoneme, such that at least phase
information of the time samples is preserved in the feature vectors;



means for determining a discontinuity between the segments, the discontinuity 
based on a distance between the features. 

106. (Canceled). 

107. (Previously Presented) The apparatus of claim 105, wherein the features 
incorporate phase information of the pitch periods. 

108. (Canceled). 

109. (Currently Amended) A system comprising: 

a processing unit coupled to a memory through a bus; and 
a process executed from the memory by the processing unit to cause the processing unit 
to gather time-domain samples from recorded speech segments, wherein the time-domain 
samples include time samples of pitch periods surrounding a segment boundary within a 
phoneme, extract features that represent the time domain samples, wherein the extracting 
features comprises constructing a matrix containing the time-domain samples of the pitch 
periods surrounding the segment boundary within the phoneme and deriving feature 
vectors that represent the time samples in a vector space by decomposing the matrix 
containing the time domain samples of the pitch periods surrounding the segment 
boundary within the phoneme, such that at least phase information of the time samples is 
preserved in the feature vectors; and determine a discontinuity between the segments, the 
discontinuity based on a distance between the features.

110. (Canceled).

111. (Previously Presented) The system of claim 109, wherein the features incorporate 
phase information of the pitch periods. 

112. (Canceled). 